Abstract. We prove that outer commutator words are uniformly concise,
i.e. if an outer commutator word $\omega$ takes $m$ different values in a
group $G$, then the order of the verbal subgroup $\omega(G)$ is bounded by a
function depending only on $m$, and not on $\omega$ or $G$. This is obtained as
a consequence of a structure theorem for the subgroup $\omega(G)$, which is
valid if $G$ is soluble, and without assuming that $\omega$ takes finitely many
values in $G$. More precisely, there is an abelian series of $\omega(G)$, such that
every section of the series can be generated by values of $\omega$ all of whose
powers are also values of $\omega$ in that section. For the proof of this latter
result, we introduce a new representation of outer commutator words
by means of binary trees, and we use the structure of the trees to set up
an appropriate induction.

1. Introduction 

Let $X$ be a set of symbols, to which we refer as indeterminates. In group
theory, a word $\omega$ over $X$ is an element of the free group having $X$ as a free
basis. If the expression for $\omega$ involves $k$ different indeterminates, then for
every group $G$, we obtain a function from $G^k$ to $G$ by substituting group
elements for the indeterminates. Thus we can consider the set $G_\omega$ of all
values taken by this function, that is,

$$G_\omega = \{\omega(g_1, \dots, g_k) \mid g_i \in G \text{ for all } i = 1, \dots, k\}.$$

The subgroup generated by $G_\omega$ is called the verbal subgroup of $\omega$ in $G$,
and is denoted by $\omega(G)$. We say that two words are equivalent if they
can be transformed into each other by simply changing the names of the
indeterminates. Obviously, equivalent words define the same set of values
and the same verbal subgroup. For this reason, we may assume if necessary
that all words are defined over the countable set $X = \{x_1, x_2, \dots\}$, and that
if $\omega$ involves $k$ indeterminates, these are given by the symbols $x_1, \dots, x_k$.

Words which are formed by taking commutators are particularly interesting.
Among them, we have the lower central words $\gamma_i$, on $i$ indeterminates,
which are given by

$$\gamma_1 = x_1, \qquad \gamma_i = [\gamma_{i-1}, x_i] = [x_1, \dots, x_i], \quad \text{for } i \ge 2,$$

and the derived words $\delta_i$, on $2^i$ indeterminates, defined recursively by

$$\delta_0 = x_1, \qquad \delta_i = [\delta_{i-1}(x_1, \dots, x_{2^{i-1}}), \delta_{i-1}(x_{2^{i-1}+1}, \dots, x_{2^i})], \quad \text{for } i \ge 1.$$

The words $\gamma_i$ and $\delta_i$ are particular instances of outer commutator words,
which are words obtained by nesting commutators, but using always different
indeterminates. Thus $[[x_1, x_2], [x_3, x_4, x_5], x_6]$ is an outer commutator word,
but the Engel word $[x_1, x_2, x_2, x_2]$ is not.

A word $\omega$ is said to be concise if, for every group $G$, the finiteness of
the set $G_\omega$ implies that of $\omega(G)$. In the 1960's, Turner-Smith published a
couple of papers [8] related to word values and verbal subgroups, where
he indicates that Philip Hall had conjectured that every word is concise, and
that Hall himself had proved this for every non-commutator word (i.e. a word
outside the commutator subgroup of the free group), and for lower central
words. In [8], Turner-Smith showed that also derived words are concise, and
Jeremy Wilson [9] subsequently extended this result to all outer commutator
words. On the other hand, Hall's conjecture was eventually refuted in 1989
by Ivanov, see [3].

If a word $\omega$ is concise, it is natural to ask whether conciseness can be
expressed in a quantitative form; more precisely, provided that $|G_\omega| = m$,
can we bound $|\omega(G)|$ by a function depending only on $m$? The answer to
this question is positive, but this does not seem to be widely known among
group theorists and, to the best of our knowledge, there is no reference in
the literature containing this result. For this reason, we have included an
appendix at the end of the paper in which we give two different proofs of
this fact. Both proofs need the ultraproduct construction for groups, over
a non-principal ultrafilter. The first one uses Łoś's Theorem from model
theory, while the second one is derived directly from the definition of an
ultraproduct, and is due to Avinoam Mann. Note that the existence of
non-principal ultrafilters is independent of the Zermelo-Fraenkel (ZF) axioms
for set theory; it can be proved by using the Axiom of Choice (but is not
equivalent to it).

The ultraproduct argument only shows the existence of bounds for concise
words, but it does not provide any explicit expressions for these bounds. In
the case of the commutator $\gamma_2 = [x_1, x_2]$, one can use the results bounding
the order of the derived subgroup $G'$ in terms of the breadth (maximum size
of a conjugacy class) of $G$. If $G$ contains at most $m$ commutators, it follows
that:

(i) If $G$ is soluble, then $|G'| \le m^{\frac{1}{2}(5+\log_2 m)}$. (P. Neumann and
Vaughan-Lee, [4].)

(ii) For a general group, $|G'| \le m^{\frac{1}{2}(13+\log_2 m)}$. (Segal and Shalev, [6].)

More recently, Brazil, Krasilnikov and Shumyatsky [1] have given explicit
bounds for all lower central words and for all derived words; as a matter of
fact, they find a single upper bound for this infinite family of words, namely
$(m!)^m$. The first main result of this paper shows that an even better uniform
bound applies to all outer commutator words.

Theorem A. Let $\omega$ be an outer commutator word and let $G$ be a group. If
$|G_\omega| = m$, then:

(i) If $G$ is soluble, $|\omega(G)| \le 2^{m-1}$.

(ii) If $G$ is not soluble, $|\omega(G)| \le (m-1)^{m-1}$.
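As a sanity check of part (i), one can compute both sides of the inequality by brute force in small soluble groups. The sketch below is only an illustration under our own assumptions: permutations are stored as tuples of images, words use a hypothetical nested-pair encoding (an integer for an indeterminate, a pair `(a, b)` for $[a, b]$), and the set of values of $[\alpha, \beta]$ is computed as the set of commutators of values of $\alpha$ with values of $\beta$, which is legitimate because $\alpha$ and $\beta$ involve different indeterminates.

```python
from itertools import permutations

def compose(p, q):
    """Product 'p then q' of permutations stored as tuples of images."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def commutator(a, b):
    """[a, b] = a^-1 b^-1 a b."""
    return compose(compose(inverse(a), inverse(b)), compose(a, b))

def values(word, group):
    """Set of values of an outer commutator word (nested pairs) in a finite group."""
    if isinstance(word, int):
        return set(group)
    left, right = values(word[0], group), values(word[1], group)
    return {commutator(a, b) for a in left for b in right}

def generated(gens, n):
    """Subgroup of Sym(n) generated by gens (closure under right multiplication)."""
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

WORDS = [("gamma_2", (1, 2)), ("gamma_3", ((1, 2), 3)), ("delta_2", ((1, 2), (3, 4)))]

results = {}
for n in (3, 4):  # Sym(3) and Sym(4) are soluble
    group = list(permutations(range(n)))
    for name, word in WORDS:
        vals = values(word, group)
        m, order = len(vals), len(generated(vals, n))
        results[(n, name)] = (m, order)
        print(f"Sym({n}), {name}: m = {m}, order = {order}, bound = {2 ** (m - 1)}")
```

In these small cases the order of the verbal subgroup stays below $2^{m-1}$, in agreement with the theorem; such experiments of course prove nothing about the general statement.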

We suspect that the bounds of Theorem A might be sharpened to get close 
to the bounds given above for the word $\gamma_2$. Nevertheless, an examination
of the papers giving upper bounds for \G'\ in terms of the breadth clearly 
suggests that this would be better the subject of an independent paper, 
devoted specifically to this question. For this reason, we have not attempted 
to obtain sharp bounds in this paper, and we have contented ourselves with 
the bounds of Theorem A which, on the other hand, are quite reasonable. 

Theorem A follows without much effort from the following result, which
yields structural information about the verbal subgroup $\omega(G)$ provided that
it is soluble (equivalently, that $G$ is soluble), without assuming that $G_\omega$ is
finite.

Theorem B. Let $\omega$ be an outer commutator word, and let $G$ be a soluble
group. Then there exists a series of subgroups from $1$ to $\omega(G)$ such that:

(i) All subgroups of the series are normal in $G$.

(ii) Every section of the series is abelian and can be generated by values
of $\omega$ all of whose powers are also values of $\omega$ in that section.

Furthermore, the length of this series only depends on the word $\omega$ and on
the derived length of $G$.

The existence of such a series was proved in [1] for derived words, and this
particular case is the starting point for our proof of Theorem B. However,
dealing with an arbitrary outer commutator word $\omega$ is a much more delicate
matter, since one has to keep control of the nesting of commutators in $\omega$, and
then there might be problems such as the commutator of two values of $\omega$ not
being necessarily a value of $\omega$ (contrary to the case of derived words). Our
approach to the general case is geometric: we associate a labelled binary tree
to every outer commutator word, a tree which reflects clearly the structure
of the word, and which makes it easy to compare any two outer commutator
words. Then the argument proceeds by measuring, with the help of the
tree, how distant the word in question is from being a derived word, and
using induction on this distance. The tree of an outer commutator word is
introduced in Section 2, together with some related concepts that will be
needed, and the proofs of Theorems A and B are postponed to Section 3.

We would like to remark that our proof of Theorem A is independent of 
and provides an alternative to the proof of the conciseness of outer commu- 
tator words given by Wilson in [9]. Wilson's argument is rather intricate 
and difficult to follow, and our geometric method provides a proof which, 
we honestly believe, is much easier to understand. Also, Wilson's proof goes 
by way of contradiction, and consequently he does not obtain any explicit 
bounds. On the other hand, notice that our proof lies within ZF, contrary to 
the proof of the existence of bounds for concise words via ultraproducts. To 
end this introduction, let us say that we are highly convinced that both the 
'tree method' introduced in this paper and Theorem B may prove important 
tools for addressing other problems related to outer commutator words. 
2. The tree of an outer commutator word



As already mentioned, a fundamental device for the proof of Theorem
B is to associate a labelled binary tree to every outer commutator word.
For this purpose, we give a recursive and more formal definition of outer
commutator words, and we use the same recursion to introduce the height
and the labelled tree of such a word. In the following, we say that two words
$\alpha$ and $\beta$ are disjoint if the sets of indeterminates appearing in the two words
are disjoint.

Definition 2.1. The set of outer commutator words, and the height and
the labelled tree of an outer commutator word, are defined recursively as
follows:

(i) An indeterminate is an outer commutator of height 0, and its tree
is an isolated vertex, labelled with the name of the indeterminate.

(ii) If $\alpha$ and $\beta$ are disjoint outer commutator words, then also $\omega = [\alpha, \beta]$
is an outer commutator word. The height $\mathrm{ht}(\omega)$ of the word $\omega$ is
taken to be the maximum of the heights of $\alpha$ and $\beta$ plus 1, and
the tree of $\omega$ is obtained by adding a new vertex with label $\omega$ and
connecting it to the vertices labelled $\alpha$ and $\beta$ of the corresponding
trees of these words.

The tree of an outer commutator word $\omega$ provides a visual way of reading
how $\omega$ is constructed by nesting commutators, easier than writing the actual
expression of $\omega$ by using commutator brackets. We draw these trees by
going downwards whenever we form a new commutator, so that the vertex
with label $\omega$ is placed at the root of the tree. Every vertex $v$ is labelled
with an outer commutator word, which we denote by $\omega_v$. Note that the
indeterminates correspond exactly to the vertices of degree 1. Also, the
height of $\omega$ coincides with the height of the tree, that is, the largest distance
from the root to another vertex of the tree (which will be necessarily labelled
by an indeterminate). For example, the following are the trees for the words
$\gamma_4$ and $\delta_3$:

[Figure 1: the trees of the words $\gamma_4$ and $\delta_3$.]

More generally, the full tree of height $h$ corresponds to the derived word $\delta_h$.



All labels of the tree of an outer commutator word are completely de- 
termined, up to equivalence, by the tree itself (as a graph without labels): 
given the tree, we only need to associate an indeterminate to every vertex 
of degree 1, and then proceed downwards by labeling each vertex with the 
commutator of the labels of its immediate ascendants. 







OUTER COMMUTATOR WORDS ARE UNIFORMLY CONCISE 






Observe that, if $\omega = [\alpha, \beta]$ is an outer commutator word, then the verbal
subgroup $\omega(G)$ coincides with the commutator subgroup $[\alpha(G), \beta(G)]$.

If $\omega$ is an outer commutator word, then the set $G_\omega$ is clearly invariant
under conjugation by elements of $G$. We remark that $G_\omega$ is not a subgroup
in general; however, it has the following property.

Lemma 2.2. Let $\omega$ be an outer commutator word. Then $G_\omega$ is symmetric,
that is, $x \in G_\omega$ implies that $x^{-1} \in G_\omega$.

Proof. We use induction on the height of $\omega$. If $\omega = x_1$ then the result is true.
Now assume that $\omega = [\alpha, \beta]$, where $\alpha, \beta$ are outer commutator words whose
height is smaller than $\mathrm{ht}(\omega)$. An element of $G_\omega$ is of the form $[y, z]$, with
$y \in G_\alpha$, $z \in G_\beta$. Then $[y, z]^{-1} = [y^z, z^{-1}]$, where $y^z \in G_\alpha$ because $G_\alpha$ is
invariant under conjugation, and $z^{-1} \in G_\beta$ by induction. So $[y, z]^{-1} \in G_\omega$,
as we wanted to prove. □
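The commutator identity used in this proof, $[y, z]^{-1} = [y^z, z^{-1}]$ with the conventions $[y, z] = y^{-1}z^{-1}yz$ and $y^z = z^{-1}yz$, holds in every group. The following sketch merely spot-checks it by brute force in the symmetric group on four points; the tuple encoding of permutations and the helper names are our own choices.

```python
from itertools import permutations

def mul(p, q):
    """Product 'p then q' of permutations stored as tuples of images."""
    return tuple(q[p[i]] for i in range(len(p)))

def inv(p):
    out = [0] * len(p)
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)

def comm(a, b):
    """[a, b] = a^-1 b^-1 a b."""
    return mul(mul(inv(a), inv(b)), mul(a, b))

def conj(a, b):
    """a^b = b^-1 a b."""
    return mul(mul(inv(b), a), b)

S4 = list(permutations(range(4)))

# Check [y, z]^-1 == [y^z, z^-1] for all 576 pairs (y, z).
identity_holds = all(
    inv(comm(y, z)) == comm(conj(y, z), inv(z)) for y in S4 for z in S4
)
print(identity_holds)  # True
```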

In the context of outer commutator words, in order to simplify the writing
of words, it is convenient to reinterpret expressions such as $[\alpha, \alpha]$ (which is
1 in the free group to which $\alpha$ belongs), by replacing the second $\alpha$ by an
equivalent word whose set of indeterminates is disjoint from that of the first
$\alpha$. More generally, we apply the same idea to every commutator $[\alpha, \beta]$ in
which $\alpha$ and $\beta$ have some indeterminate in common, so that $[\alpha, \beta]$ is a
well-defined outer commutator word up to equivalence. Allowing this notation,
the derived words can be defined by $\delta_0 = x_1$ and $\delta_i = [\delta_{i-1}, \delta_{i-1}]$ for $i \ge 1$,
and the lower central words by $\gamma_1 = x_1$ and $\gamma_i = [\gamma_{i-1}, \gamma_1]$ for $i \ge 2$. Also,
the tree corresponding to the word $[\gamma_3, \gamma_3]$ is the following:



Figure 2. The tree of the outer commutator word $[\gamma_3, \gamma_3]$.

We note that the vertices of the tree are naturally positioned in levels. 
More formally, we have the following. 

Definition 2.3. Let $v$ be a vertex of the tree of an outer commutator word
$\omega$ of height $h$. We say that $v$ is in the $i$-th level of the tree if it lies at distance
$h - i$ from the root of the tree.

Thus the upmost level will be level 0 and the root will be at level $h$, but
note that a vertex $v$ at level $i$ is not necessarily labelled with a word $\omega_v$ of
height $i$; it might even happen that $\omega_v$ is an indeterminate.

It is also useful to associate a companion vertex to each vertex of the tree 
different from the root, defined as follows. 

Definition 2.4. Let $p$ be a vertex of the tree of an outer commutator word
$\omega$, different from the root, and let $u$ be the immediate descendant of $p$. Then
the companion of $p$ is the only other vertex $q$ of the tree which has $u$ as an
immediate descendant.

It is clear that companion vertices lie on the same level of the tree.

As said in the introduction, we will prove Theorem B for a general outer
commutator word $\omega$ by induction on the 'distance' of $\omega$ to the closest derived
word. We make this notion of distance precise in the following definition.

Definition 2.5. Let $\omega$ be an outer commutator word of height $h$. Then the
defect of $\omega$, which is denoted by $\operatorname{def} \omega$, is defined as

$$\operatorname{def} \omega = 2^{h+1} - 1 - V,$$

where $V$ is the number of vertices of the tree of $\omega$.

So, if the height of $\omega$ is $h$, then the defect is the number of vertices that
need to be added to the tree of $\omega$ in order to get the tree of $\delta_h$. Thus the
defect is 0 if and only if $\omega$ is a derived word, and we have $\operatorname{def} \gamma_4 = 8$ and
$\operatorname{def} [\gamma_3, \gamma_3] = 4$.
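The height and the defect are straightforward to compute from the tree. The following minimal sketch reproduces the two values quoted above; it assumes a hypothetical encoding of words as nested pairs (an integer for an indeterminate, a pair `(a, b)` for the commutator $[a, b]$), and the function names are ours.

```python
def height(word):
    """Height of an outer commutator word given as nested pairs."""
    if isinstance(word, int):
        return 0
    return 1 + max(height(word[0]), height(word[1]))

def vertex_count(word):
    """Number of vertices V of the tree of the word."""
    if isinstance(word, int):
        return 1
    return 1 + vertex_count(word[0]) + vertex_count(word[1])

def defect(word):
    """def(w) = 2^(h+1) - 1 - V, where h is the height of w."""
    return 2 ** (height(word) + 1) - 1 - vertex_count(word)

gamma4 = (((1, 2), 3), 4)                       # gamma_4
gamma3_gamma3 = (((1, 2), 3), ((4, 5), 6))      # [gamma_3, gamma_3]
delta3 = (((1, 2), (3, 4)), ((5, 6), (7, 8)))   # delta_3

print(defect(gamma4), defect(gamma3_gamma3), defect(delta3))  # 8 4 0
```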

Let now $\varphi$ and $\omega$ be two words, and let $F$ be the free group to which
$\varphi$ belongs. We say that $\varphi$ is $\omega$-valued if $\varphi \in F_\omega$. If this is the case, then
we have $G_\varphi \subseteq G_\omega$ for every group $G$, and in particular $\varphi(G) \le \omega(G)$. For
example, $\delta_2$ is $\gamma_3$-valued, but not conversely.

Definition 2.6. Let $\varphi$ and $\omega$ be two outer commutator words. Then:

(i) We say that $\omega$ is a constituent of $\varphi$ if $\omega$ is, up to equivalence, the
label of a vertex in the tree of $\varphi$.

(ii) We say that $\varphi$ is an extension of $\omega$, or that $\omega$ is a restriction of $\varphi$,
if the tree of $\varphi$ is an upward extension of the tree of $\omega$ (simply as a
tree, without labels).

Thus, in order to get an extension of $\omega$, we only need to draw new binary
trees at some of the vertices which are labelled by indeterminates in the tree
of $\omega$. Equivalently, a restriction of $\omega$ is obtained by selecting a number of
vertices and erasing all branches lying on top of these vertices in the tree of
$\omega$.

Figure 3. An extension of $[\gamma_4, \delta_2]$.



In Figure 3 the black tree represents the word $\omega = [\gamma_4, \delta_2]$, and the
extension of $\omega$ which is obtained by adding the grey trees is $\varphi = [[\gamma_3, \gamma_3], [\delta_2, \gamma_3]]$.
Without having to check the commutator structure of these two words, the
trees show that $\varphi$ is $\omega$-valued. On the other hand, observe that the derived
word $\delta_h$ is an extension of all words of height less than or equal to $h$.

The following lemma is straightforward.

Lemma 2.7. Let $\varphi$ and $\omega$ be two outer commutator words. Then:

(i) If $\omega$ is a constituent of $\varphi$, then $\varphi(G) \le \omega(G)$.

(ii) If $\varphi$ is an extension of $\omega$, then $\varphi$ is $\omega$-valued.
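Since the extension relation only depends on the shape of the trees, it can be tested by a simultaneous recursion on both trees. The sketch below is an illustration under our own assumptions: trees are nested pairs whose leaves are all the placeholder `X` (labels are ignored), and left and right branches are compared in the order in which they are written.

```python
X = 0  # placeholder leaf: only the shape of the tree matters here

def gamma(i):
    """Shape of the tree of the lower central word gamma_i."""
    word = X
    for _ in range(i - 1):
        word = (word, X)
    return word

def delta(i):
    """Shape of the tree of the derived word delta_i (full tree of height i)."""
    return X if i == 0 else (delta(i - 1), delta(i - 1))

def is_extension(phi, omega):
    """True if the tree of phi is an upward extension of the tree of omega."""
    if not isinstance(omega, tuple):
        return True          # any tree may be drawn on top of an indeterminate
    if not isinstance(phi, tuple):
        return False         # omega branches here but phi does not
    return is_extension(phi[0], omega[0]) and is_extension(phi[1], omega[1])

omega = (gamma(4), delta(2))                         # [gamma_4, delta_2]
phi = ((gamma(3), gamma(3)), (delta(2), gamma(3)))   # [[gamma_3, gamma_3], [delta_2, gamma_3]]

print(is_extension(phi, omega))        # True
print(is_extension(omega, phi))        # False
print(is_extension(delta(4), omega))   # True
```

The last line illustrates the remark that $\delta_h$ is an extension of every word of height at most $h$; here $[\gamma_4, \delta_2]$ has height 4.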

3. Proof of Theorems A and B 

Before proceeding to the proof of Theorem B, we need some lemmas. 
First, we need to introduce the following concept. 

Definition 3.1. Let $T$ be the tree associated to an outer commutator word
$\omega$. A subset $S$ of vertices of $T$ is called a section of $T$ if $S$ is maximal (with
respect to inclusion) subject to the condition that $S$ does not contain two
vertices which are one a descendant of the other. Equivalently, in terms of
labels, this means that every indeterminate involved in $\omega$ appears in exactly
one word $\omega_v$ with $v \in S$.

Visually, taking a section is nothing but cutting the tree from side to side. 




Figure 4. A section of $[\gamma_3, \gamma_3, \delta_2]$.

A very natural way of obtaining a section is by cutting a tree below level i, 
that is, we consider the section S containing all vertices at level i + 1 and all 
the vertices of the tree lying below level i + 1 labelled by an indeterminate. 
This is the type of section that we will use in the proof of Theorem B. 
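Cutting a tree below level $i$ can be described by a short recursion from the root. The following sketch is only illustrative and rests on our own conventions: words are nested pairs (an integer for an indeterminate, a pair `(a, b)` for $[a, b]$), the root sits at level $h$, and the function returns the labels of the vertices of the section from left to right.

```python
def height(word):
    if isinstance(word, int):
        return 0
    return 1 + max(height(word[0]), height(word[1]))

def cut_below(word, i):
    """Labels of the section obtained by cutting the tree of `word` below level i."""
    h = height(word)
    if not 0 <= i < h:
        raise ValueError("the level i must satisfy 0 <= i < height")
    section = []

    def visit(node, level):
        # Keep every vertex at level i + 1, and every indeterminate met before it.
        if level == i + 1 or isinstance(node, int):
            section.append(node)
            return
        visit(node[0], level - 1)
        visit(node[1], level - 1)

    visit(word, h)
    return section

# [gamma_4, gamma_4] with the second copy written on x_5, ..., x_8
word = ((((1, 2), 3), 4), (((5, 6), 7), 8))
print(cut_below(word, 0))  # [(1, 2), 3, 4, (5, 6), 7, 8]
```

In the output, every indeterminate of the word appears in exactly one label, as required by Definition 3.1.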




Figure 5. Section of $[\gamma_4, \gamma_4]$ by cutting below level 0.

If $\omega = [\alpha, \beta]$ and $\gamma$ are two outer commutator words, then by the Three
Subgroup Lemma, we have

$$[\omega(G), \gamma(G)] \le \pi^{(1)}(G)\, \pi^{(2)}(G),$$

where $\pi^{(1)} = [[\alpha, \gamma], \beta]$ and $\pi^{(2)} = [\alpha, [\beta, \gamma]]$ are also outer commutator
words. Observe that the tree of $\pi^{(1)}$ is very similar to that of $\omega$: one only
needs to replace the tree on top of the vertex labelled $\alpha$ with the tree
corresponding to $[\alpha, \gamma]$. The same happens with $\pi^{(2)}$, with $\beta$ playing the role of
$\alpha$. The following lemma is a generalization of this fact; instead of stopping
at the vertices labelled $\alpha$ and $\beta$, by iterating the process we can reach an
arbitrary section of the tree.






GUSTAVO A. FERNANDEZ-ALCOBER AND MARTA MORIGI 



Lemma 3.2. Let $\omega$ be an outer commutator word, and let $T$ be the tree of $\omega$.
If $\gamma$ is another outer commutator word, then for every $v \in T$, we define $\pi^{(v)}$
to be the word whose tree is obtained by replacing the tree of $\omega_v$ at vertex $v$
with the tree of $[\omega_v, \gamma]$. Then, for every section $S$ of $T$, and for every group
$G$, we have

$$[\omega(G), \gamma(G)] \le \prod_{v \in S} \pi^{(v)}(G).$$

Proof. We argue by induction on the number $n$ of vertices of $S$. The case
$n = 1$ is obvious, so we assume that $n \ge 2$. We also observe that the product
$\prod_{v \in S} \pi^{(v)}(G)$ depends only on the subgroups $\pi^{(v)}(G)$, for $v \in S$, and not
on the order in which they appear, since all those subgroups are normal in
$G$. Let $p$ be a vertex in $S$ which has maximum distance from the root, let $q$
be its companion vertex, and let $u$ be the immediate descendant of $p$ and $q$.
Since each of the indeterminates involved in the word $\omega_u$ appears in exactly
one of the words $\omega_v$ with $v \in S$, it necessarily follows from the assumption
about $p$ that $q \in S$. Now let $S'$ be the section of $T$ which is obtained from $S$
by deleting $p$ and $q$, and inserting $u$. By applying the induction hypothesis
to $S'$, we have

$$[\omega(G), \gamma(G)] \le \pi^{(u)}(G) \prod_{v \in S \setminus \{p, q\}} \pi^{(v)}(G). \tag{1}$$

On the other hand, by the Three Subgroup Lemma,

$$[\omega_u(G), \gamma(G)] = [\omega_p(G), \omega_q(G), \gamma(G)] \le [[\omega_p(G), \gamma(G)], \omega_q(G)]\, [\omega_p(G), [\omega_q(G), \gamma(G)]],$$

and consequently $\pi^{(u)}(G) \le \pi^{(p)}(G)\, \pi^{(q)}(G)$, which completes the proof by
(1). □

Definition 3.3. Let $\omega$ be an outer commutator word, and let $G$ be a group.
A series of normal subgroups of $G$ is said to be power-closed generated (or a
PCG-series, for short) with respect to $\omega$ if every section $H/J$ of the series
is abelian and can be generated by values of $\omega$ in $G/J$ all of whose powers
are again values of $\omega$ in $G/J$.

It is clear that a series $H_0 \le H_1 \le \cdots \le H_n$ of normal subgroups of
a group $G$ is a PCG-series with respect to $\omega$ if and only if, for every
$i = 1, \dots, n$, the quotient $H_i/H_{i-1}$ is abelian and we can choose a subset $S_i$ of
$G_\omega$ such that:

(P1) $H_i/H_{i-1} = \langle xH_{i-1} \mid x \in S_i \rangle$, i.e. $H_i = \langle S_i \rangle H_{i-1}$.

(P2) $x^n H_{i-1} \in (G/H_{i-1})_\omega$ for every $x \in S_i$ and every $n \in \mathbb{Z}$.

Furthermore, since the set of $\omega$-values is closed under conjugation, and the
subgroups in a PCG-series are normal, we may assume if necessary that $S_i$
is a normal subset of $G$.

Obviously, any PCG-series with respect to $\omega$ beginning from the trivial
subgroup is contained in $\omega(G)$, and the content of Theorem B is precisely
that, starting from 1, we can always reach $\omega(G)$ with a PCG-series provided
that $G$ is soluble.

Moreover, we note that if $\varphi$ is another outer commutator word which is
$\omega$-valued, then any PCG-series with respect to $\varphi$ is also a PCG-series with
respect to $\omega$. We will repeatedly use this fact in the sequel without further
mention.

Now we state two more lemmas that we need for the proof of Theorem B. 

Lemma 3.4. Let $\omega$ be an outer commutator word, let $G$ be a group, and let
$K$ and $L$ be two normal subgroups of $G$. If there are two PCG-series with
respect to $\omega$ from 1 to $K$ and from 1 to $L$, then there is also a PCG-series
from 1 to $KL$.

Proof. If $H/J$ is a normal abelian section of $G$ generated by values of $\omega$ such
that all of their powers are also values of $\omega$, then also the section $HL/JL$
has this property. Now the result readily follows. □

Lemma 3.5. Let $\alpha$ and $\beta$ be two outer commutator words, and let $G$ be a
group. If there is a PCG-series from $K$ to $L$ in the group $G$ with respect
to $\alpha$, and if $[L, \beta(G), L] = 1$, then by taking commutators with $\beta(G)$ we
obtain a PCG-series from $[K, \beta(G)]$ to $[L, \beta(G)]$ with respect to $[\alpha, \beta]$. In
particular, if there is a PCG-series from 1 to $\alpha(G)$ with respect to $\alpha$, and
if $[\alpha(G), \beta(G), \alpha(G)] = 1$, then there is also a series from 1 to $[\alpha(G), \beta(G)]$
with respect to $[\alpha, \beta]$.

Proof. We first note that the condition $[L, \beta(G), L] = 1$ implies in particular
that $[L, \beta(G)]$ is abelian, so that any section of this group is also abelian.

Let $K = H_0 \le H_1 \le \cdots \le H_n = L$ be a PCG-series with respect to $\alpha$.
We fix an integer $i$ from 1 to $n$, and choose a normal subset $S_i$ of $G$ which
is contained in $G_\alpha$, and which satisfies properties (P1) and (P2) above. We
claim that the set $T_i = \{[x, y] \mid x \in S_i,\ y \in G_\beta\}$ satisfies (P1) and (P2)
for the section $[H_i, \beta(G)]/[H_{i-1}, \beta(G)]$ and the word $[\alpha, \beta]$. This proves the
result, since $T_i$ is contained in $G_{[\alpha, \beta]}$.

First of all, since $S_i$ and $G_\beta$ are normal subsets of $G$, the same is true
about $T_i$. Then $N = \langle T_i \rangle [H_{i-1}, \beta(G)]$ is a normal subgroup of $G$, and $H_i$
and $\beta(G)$ clearly commute modulo $N$. Thus $[H_i, \beta(G)] \le N$, and property
(P1) follows. Now let $[x, y]$ be an element of $T_i$, with $x \in S_i$ and $y \in G_\beta$.
By using the fact that $[L, \beta(G), L] = 1$, we have $[x, y]^n = [x^n, y]$ for every
$n \in \mathbb{Z}$. Since $S_i$ satisfies (P2), we can write $x^n = a_n b_n$, with $a_n \in H_{i-1}$ and
$b_n \in G_\alpha$. Thus

$$[x, y]^n = [a_n b_n, y] = [a_n, y]^{b_n} [b_n, y] \equiv [b_n, y] \pmod{[H_{i-1}, \beta(G)]},$$

which proves that (P2) holds for $T_i$. □
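The key computational step in this proof is the identity $[x, y]^n = [x^n, y]$, which is available because $[x, y]$ commutes with $x$ under the hypothesis $[L, \beta(G), L] = 1$. As an illustration only, the sketch below checks this identity in a group where all commutators are central, namely the group of upper unitriangular $3 \times 3$ matrices over $\mathbb{Z}/3$; the triple encoding `(a, b, c)` of such a matrix and all helper names are assumptions made for this example.

```python
from itertools import product

P = 3  # we work over the integers modulo 3

# The triple (a, b, c) stands for the matrix [[1, a, c], [0, 1, b], [0, 0, 1]].
IDENTITY = (0, 0, 0)

def mul(g, h):
    a, b, c = g
    d, e, f = h
    return ((a + d) % P, (b + e) % P, (c + f + a * e) % P)

def inv(g):
    a, b, c = g
    return ((-a) % P, (-b) % P, (a * b - c) % P)

def power(g, n):
    """g^n for any integer n, by repeated multiplication."""
    base = g if n >= 0 else inv(g)
    result = IDENTITY
    for _ in range(abs(n)):
        result = mul(result, base)
    return result

def comm(x, y):
    """[x, y] = x^-1 y^-1 x y."""
    return mul(mul(inv(x), inv(y)), mul(x, y))

group = list(product(range(P), repeat=3))

# In this group every commutator is central, so [x, y]^n = [x^n, y] should hold.
identity_holds = all(
    power(comm(x, y), n) == comm(power(x, n), y)
    for x in group for y in group for n in range(-4, 5)
)
print(identity_holds)  # True
```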

Now we can easily see that Theorem B is true for derived words. This fact
is already proved in Lemma 3.3 of [1], and the proof we provide is essentially
the same. We include it here for the sake of completeness, and because the
use of Lemma 3.5 simplifies the presentation. In the following, $G^{(i)}$ will
denote as usual the $i$-th term $\delta_i(G)$ of the derived series of a group $G$.

Theorem 3.6. Let $G$ be a soluble group. Then, for every $i \ge 0$, there exists
a PCG-series from 1 to $G^{(i)}$ with respect to $\delta_i$. Furthermore, if the derived
length of $G$ is $d$, there is such a series of length at most $2^d - 2^i$ if $d > i$, or
0 if $d \le i$.
Proof. We first deal with the particular case when $G^{(i)}$ is abelian. Let us
prove, by induction on $i$, that there is a PCG-series of length $2^i$ from 1 to
$G^{(i)}$ with respect to $\delta_i$. This is obvious for $i = 0$, so we assume that $i \ge 1$
and that the result holds for $i - 1$. If we apply it to the group $G'$, we obtain
a PCG-series of length at most $2^{i-1}$ from 1 to $G^{(i)}$ with respect to $\delta_{i-1}$.
By Lemma 3.5 with $\alpha = \beta = \delta_{i-1}$, it follows that there is a PCG-series
of the same length from 1 to $[G^{(i)}, G^{(i-1)}]$ with respect to $\delta_i$. On the other
hand, if we use again the result for $\delta_{i-1}$, but in this case with the group
$G/G^{(i)}$, we get a PCG-series of length at most $2^{i-1}$ from $G^{(i)}$ to $G^{(i-1)}$
with respect to $\delta_{i-1}$. Another application of Lemma 3.5 yields a PCG-series
from $[G^{(i)}, G^{(i-1)}]$ to $[G^{(i-1)}, G^{(i-1)}] = G^{(i)}$ with respect to $\delta_i$. Now we can
connect the two PCG-series with respect to $\delta_i$ that we have obtained so far,
and the induction is complete.

Let us now deal with the general case. If $i \ge d$ there is nothing to prove,
so we assume that $i < d$. By the last paragraph, for every $j$ between $i$ and
$d - 1$ there is a PCG-series from $G^{(j+1)}$ to $G^{(j)}$ with respect to $\delta_j$, of length
$2^j$. Since $\delta_j$ is $\delta_i$-valued for $j \ge i$, by connecting these series we obtain a
PCG-series from 1 to $G^{(i)}$ with respect to $\delta_i$ of length at most $2^d - 2^i$, as
desired. □
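The length bookkeeping in this proof amounts to two small computations: the doubling recursion of the abelian case, and the geometric sum $\sum_{j=i}^{d-1} 2^j = 2^d - 2^i$ in the general case. The sketch below simply replays this arithmetic; the function names are ours and the code is not part of the proof.

```python
def abelian_case_length(i):
    """Length bound from the first part of the proof: L(0) = 1, L(i) = 2 L(i-1)."""
    return 1 if i == 0 else 2 * abelian_case_length(i - 1)

def total_length(i, d):
    """Bound obtained by connecting the series for j = i, ..., d - 1."""
    if d <= i:
        return 0
    return sum(abelian_case_length(j) for j in range(i, d))

for i, d in [(0, 1), (1, 3), (2, 5)]:
    print(i, d, total_length(i, d), 2 ** d - 2 ** i)
```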



We can now prove Theorem B for arbitrary outer commutator words. 

Proof of Theorem B. We concentrate on proving the existence of a PCG-series
with respect to $\omega$ from 1 to $\omega(G)$; a close examination of the proof that
follows shows that the length of the PCG-series constructed only depends
on $\omega$ and on the derived length of $G$, and not on the particular group $G$.

We argue by double induction: we first use induction on the height of the
word $\omega$, and then, for a fixed value of the height, induction on the defect
of $\omega$. If $\omega$ has height 0, then $\omega = x_1$ and the result is trivially true. Now
assume that $h = \mathrm{ht}(\omega) \ge 1$ and that the result has been proved for any
outer commutator word whose height is less than $h$. If $\operatorname{def}(\omega) = 0$ then $\omega$
is a derived word, and the result holds by Theorem 3.6, so we assume that
$\operatorname{def}(\omega) > 0$. Let us write $\omega = [\alpha, \beta]$, where $\alpha$ and $\beta$ are outer commutator
words of height smaller than $h$. Then we have a PCG-series from 1 to
$\alpha(G)$ with respect to $\alpha$, and another one from 1 to $\beta(G)$ with respect to
$\beta$. If we can reduce ourselves to the case that $[\omega(G), \alpha(G)] = 1$ or that
$[\omega(G), \beta(G)] = 1$, then the proof of the theorem will be complete by invoking
Lemma 3.5.

Let $\Phi$ be the (finite) set of all outer commutator words of height $h$ which
are a proper extension of $\omega$. By the induction hypothesis on the defect,
for every $\varphi$ in $\Phi$, there is a PCG-series from 1 to $\varphi(G)$ with respect to $\omega$,
since $\varphi$ is $\omega$-valued according to Lemma 2.7. By using Lemma 3.4 we can
combine the series corresponding to all different words in $\Phi$, and get a single
PCG-series whose last term $L$ contains $\varphi(G)$ for all $\varphi$ in $\Phi$. For the theorem
to be proved, it suffices to find the desired PCG-series with respect to $\omega$ in
the quotient $G/L$, and so we may assume in the remainder that $\varphi(G) = 1$
for all $\varphi$ in $\Phi$. We cannot guarantee in general that $[\omega, \alpha]$ or $[\omega, \beta]$ belong
to the set $\Phi$. However, we prove below that at least one of the subgroups
$[\omega(G), \alpha(G)]$ and $[\omega(G), \beta(G)]$ is contained in a product of verbal subgroups
corresponding to words in $\Phi$, and is consequently equal to 1, as desired.

Let $i$ be the largest integer for which there is a vertex in the tree of $\omega$ at
level $i$ with label $\delta_i$. Note that $1 \le i < h$, since $\omega$ is not a derived word. Let
$S$ be the section of the tree of $\omega$ obtained by cutting the tree below level
$i$, so that $S$ contains all vertices at level $i + 1$ and all the vertices of the
tree lying below level $i + 1$ which are labelled with an indeterminate. For
every vertex $v$ in $S$, we construct a word $\omega^{(v)}$ as follows. If the label $\omega_v$ of
$v$ is not an indeterminate, then we can write $\omega_v = [\omega_p, \omega_q]$, where $p$ and $q$
are the companion vertices at level $i$ having $v$ as immediate descendant. By
the maximality of $i$, one of these vertices is labelled with a word which is
different from $\delta_i$. For simplicity, let us assume that this happens for $q$, the
vertex on the right (the argument is exactly the same otherwise). We define
$\omega^{(v)}$ to be the word whose tree is obtained by replacing $\omega_q$ with $\delta_i$ in the
tree of $\omega$. Thus the label of $\omega^{(v)}$ at the vertex $v$ is the commutator $[\omega_p, \delta_i]$.
On the other hand, if $\omega_v$ is an indeterminate, then $\omega^{(v)}$ is defined simply by
putting the tree corresponding to $\delta_i$ on top of the vertex $v$ in the tree of $\omega$.




Figure 6. The two different cases for the construction of
$\omega^{(v)}$ with $v \in S$. Observe that $i = 1$ in this example.

In any case, it is clear that $\mathrm{ht}(\omega^{(v)}) = h$ and that $\omega^{(v)}$ is a proper extension
of $\omega$, so that $\omega^{(v)}$ belongs to $\Phi$. Consequently, we have $\omega^{(v)}(G) = 1$ for every
vertex $v$ in the section $S$.

On the other hand, if we apply Lemma 3.2 to the section $S$ with $\delta_i$ playing
the role of $\gamma$, then we have

$$[\omega(G), \delta_i(G)] \le \prod_{v \in S} \pi^{(v)}(G). \tag{2}$$

Here, $\pi^{(v)}$ is the word whose tree is obtained by inserting the tree of $[\omega_v, \delta_i]$
at vertex $v$ in the tree of $\omega$. Now, it is easy to compare the two words $\omega^{(v)}$
and $\pi^{(v)}$: they look the same at all vertices of the original tree of $\omega$, except
for the vertex $v$, where $\pi^{(v)}$ has the label $[\omega_v, \delta_i]$ and $\omega^{(v)}$ has either $[\omega_p, \delta_i]$
or $\delta_i$. In any of the two cases, we have

$$(\pi^{(v)})_v(G) \le (\omega^{(v)})_v(G),$$

and then, since $\pi^{(v)}$ and $\omega^{(v)}$ have the same labels outside the tree above $v$,
also

$$\pi^{(v)}(G) \le \omega^{(v)}(G).$$

Since this happens for all vertices in $S$, it follows from (2) that $[\omega(G), \delta_i(G)] = 1$.
Now, by the definition of $i$, the derived word $\delta_i$ is a constituent of either
$\alpha$ or $\beta$, and consequently either $[\omega(G), \alpha(G)] = 1$ or $[\omega(G), \beta(G)] = 1$. As
explained above, this completes the proof. □

Finally, we derive Theorem A from Theorem B by adapting the argument
given by Brazil, Krasilnikov and Shumyatsky in [1] for the case of derived
words. We will need Dietzmann's Lemma, whose proof can be found in
[5, 14.5.7].

Lemma 3.7. If $G$ is a group and $X = \{c_1, \dots, c_n\}$ is a normal subset,
then every element $y \in \langle X \rangle$ is of the form $y = \prod_{i=1}^{n} c_i^{r_i}$, for some integers
$r_1, \dots, r_n$.

Proof of Theorem A. Suppose first that $G$ is soluble. If $\omega(G) = 1$ the result
is trivial, so we may assume that $\omega(G)$ is not the identity subgroup. By
Theorem B, there is a PCG-series

$$1 = H_0 \le H_1 \le \cdots \le H_n = \omega(G).$$

Since $G_\omega$ is finite, each of the abelian quotients $H_i/H_{i-1}$ can be generated
by a finite number of values of $\omega$ all of whose powers are again values of $\omega$.
Then we can refine this PCG-series to a subnormal series

$$1 = G_0 < G_1 < \cdots < G_k = \omega(G)$$

in which every section $G_i/G_{i-1}$ is a non-trivial cyclic group consisting
entirely of values of $\omega$. Observe that, contrary to the original PCG-series, the
length of this refined series may depend on the group $G$ (more precisely,
on the rank of $G$); however, this will have no effect in the proof. Now, for
every non-trivial element $x$ in $G_i/G_{i-1}$, there exists $y \in G_\omega \setminus \{1\}$ such that
$x = yG_{i-1}$, and consequently

$$|G_1/G_0| + |G_2/G_1| + \cdots + |G_k/G_{k-1}| \le m + k - 1.$$

Observe that $\log_2 |G_i/G_{i-1}| \le |G_i/G_{i-1}| - 1$ for all $i$, since $|G_i/G_{i-1}| \ge 2$.
Hence

$$|\omega(G)| = \prod_{i=1}^{k} |G_i/G_{i-1}| = 2^{\sum_{i=1}^{k} \log_2 |G_i/G_{i-1}|} \le 2^{\sum_{i=1}^{k} |G_i/G_{i-1}| - k} \le 2^{m-1},$$

which proves part (i) of Theorem A.
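The counting step above is purely arithmetical: if the section orders $n_1, \dots, n_k \ge 2$ satisfy $n_1 + \cdots + n_k \le m + k - 1$, then their product is at most $2^{m-1}$. The following sketch checks this by exhaustive search for small $m$; it enumerates the excesses $a_i = n_i - 1 \ge 1$ with $\sum a_i \le m - 1$, and the function names are our own.

```python
from math import prod

def excess_tuples(total):
    """Non-empty non-decreasing tuples of integers >= 1 with sum at most `total`."""
    def extend(prefix, remaining, smallest):
        if prefix:
            yield tuple(prefix)
        for a in range(smallest, remaining + 1):
            yield from extend(prefix + [a], remaining - a, a)
    yield from extend([], total, 1)

def worst_product(m):
    """Largest product of section orders n_i = a_i + 1 compatible with the count."""
    return max(prod(a + 1 for a in excess) for excess in excess_tuples(m - 1))

for m in range(2, 9):
    print(m, worst_product(m), 2 ** (m - 1))
```

In this purely numerical experiment the maximum equals $2^{m-1}$ and is reached when all sections have order 2, so this particular step of the argument cannot be improved without further information on the sections.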

Now assume that $G$ is non-soluble. Observe that $m \ge 3$ in this case,
since otherwise $\omega(G)$ is cyclic and $G$ is soluble. If $h$ is the height of $\omega$,
then $\delta_h$ is $\omega$-valued, and the same holds for $\delta_{h+1}$. Let $|G_{\delta_{h+1}}| = l$. Then
$|(G/G^{(h+1)})_\omega| \le m - l + 1$, and by the bound for the soluble case, it follows
that

$$|\omega(G)/G^{(h+1)}| \le 2^{m-l}.$$

Now we bound the order of $G^{(h+1)}$. We claim that the order of an element
$g \in G_{\delta_{h+1}}$ is at most $(m-1)(m-2)$. Of course, we may assume $g \ne 1$. Let
us write $g = [a, b]$ with $a, b \in G_{\delta_h}$, and consider the subgroup $H = \langle a, b \rangle$.
Let $C = C_H(a)$. Since $a \in G_\omega \setminus \{1\}$, it has at most $m - 1$ conjugates in
$G$, and consequently $|H : C| \le m - 1$. Now $C$ permutes the $m - 1$ non-trivial
values of $\omega$ in $G$, and leaves the element $a$ fixed by definition. Thus
$|C : C_C(b)| \le m - 2$, and consequently $|H : Z(H)| = |H : C_H(a) \cap C_H(b)| \le
(m-1)(m-2)$. By applying Schur's Theorem [5, 10.1.4] to $H$, it follows
that the exponent of $H'$ is at most $(m-1)(m-2)$, which proves the claim.

Let $G_{\delta_{h+1}} = \{c_0, c_1, c_2, \dots, c_{l-1}\}$, where $c_0 = 1$. By
Lemma 2.2, the set $G_{\delta_{h+1}}$ is symmetric, so we can assume that
$c_i = c_i^{-1}$ for every $i = 0, \dots, t$ for some $0 \le t \le l-1$ and
that $c_{t+2j} = c_{t+2j-1}^{-1}$ for each $j = 1, \dots, (l-1-t)/2$ (note that
$l-1-t$ is even). Since $G_{\delta_{h+1}}$ is a normal subset of $G$, it
follows from Lemma 3.7 that every element of $\delta_{h+1}(G)$ is of the form
$c_1^{n_1} c_2^{n_2} \cdots c_{l-1}^{n_{l-1}}$, where $1 \le n_i \le |c_i|$.
Now, we have two choices for each $n_i$ with $1 \le i \le t$ (if $t \ge 1$,
otherwise we have nothing to choose) and at most $(m-1)(m-2)$ choices for each
product of the form

\[ c_{t+2j-1}^{n_{t+2j-1}} c_{t+2j}^{n_{t+2j}} = c_{t+2j-1}^{n_{t+2j-1} - n_{t+2j}}. \]

Thus

\[ |G^{(h+1)}| \le 2^t [(m-1)(m-2)]^{(l-1-t)/2} \le 2^t (m-1)^{l-1-t} \le (m-1)^{l-1}, \]

and we conclude that

\[ |\omega(G)| = |\omega(G)/G^{(h+1)}| \, |G^{(h+1)}| \le 2^{m-l} (m-1)^{l-1} \le (m-1)^{m-1}, \]

since $m \ge 3$. This completes the proof of Theorem A. □
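
The final estimate $2^{m-l}(m-1)^{l-1} \le (m-1)^{m-1}$ relies only on
$m \ge 3$ and $1 \le l \le m$. The short Python sketch below checks it over a
finite range and shows that it fails for $m = 2$; the function names and the
range are assumptions made for illustration.

```python
def final_bound_holds(m, l):
    """Check 2**(m - l) * (m - 1)**(l - 1) <= (m - 1)**(m - 1)."""
    return 2 ** (m - l) * (m - 1) ** (l - 1) <= (m - 1) ** (m - 1)


def check_range(max_m=40):
    """Test the estimate for all 3 <= m <= max_m and 1 <= l <= m."""
    return all(
        final_bound_holds(m, l)
        for m in range(3, max_m + 1)
        for l in range(1, m + 1)
    )


if __name__ == "__main__":
    print(check_range())
    print(final_bound_holds(2, 1))
```

The failure at $m = 2$ reflects the fact that the factor $2^{m-l}$ can only be
absorbed into a power of $m - 1$ when $m - 1 \ge 2$.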



4. Appendix: Existence of bounds via ultraproducts 

In this appendix, we give two different proofs of the following result, 
mentioned in the introduction. 

Theorem 4.1. Let $\omega$ be a concise word. Then there exists a function
$f : \mathbb{N} \to \mathbb{N}$ such that, if $G$ is a group in which
$|G_\omega| \le m$, then $|\omega(G)| \le f(m)$.

For the convenience of the reader, we begin by recalling briefly the con- 
struction of ultraproducts of groups. To this end, we need the concept of an 
ultrafilter. (See [2] for an account of ultraproducts from an algebraic point
of view.) 

Definition 4.2. A filter over a non-empty set $I$ is a non-empty family
$\mathcal{U}$ of subsets of $I$ such that:

(i) The intersection of two elements of $\mathcal{U}$ also lies in $\mathcal{U}$.

(ii) If $P$ is in $\mathcal{U}$ and $P \subseteq Q$, then also $Q$ is in $\mathcal{U}$.

(iii) The empty set does not belong to $\mathcal{U}$.

The filter $\mathcal{U}$ is called principal if it consists of all supersets of
a fixed subset of $I$, and it is called an ultrafilter if it is maximal in the
set of all filters over $I$ ordered by inclusion.

Equivalently, a filter $\mathcal{U}$ over $I$ is an ultrafilter if and only if,
for every subset $J$ of $I$, either $J \in \mathcal{U}$ or
$I \setminus J \in \mathcal{U}$. By (i) and (iii) above, only one of these
conditions holds.

The existence of non-principal ultrafilters is independent of the Zermelo- 
Fraenkel axioms for set theory. It can be easily proved by using the Axiom 
of Choice, but is in fact weaker than that. On the other hand, an ultrafilter 
over $I$ is non-principal if and only if it contains all cofinite subsets of $I$.
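
In particular, a finite set carries no non-principal ultrafilters, since the
empty set is then cofinite. The following Python sketch confirms this by brute
force on a three-element set, using the filter axioms of Definition 4.2 and
the complement criterion stated above. The function names and the choice of
the base set are assumptions made for this illustration.

```python
from itertools import chain, combinations


def powerset(base):
    """All subsets of `base`, each returned as a frozenset."""
    items = list(base)
    raw = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in raw]


def is_filter(family, subsets):
    """Check conditions (i)-(iii) of Definition 4.2 for a non-empty family."""
    if not family or frozenset() in family:
        return False
    closed_under_meet = all(p & q in family for p in family for q in family)
    closed_upwards = all(q in family for p in family for q in subsets if p <= q)
    return closed_under_meet and closed_upwards


def is_ultrafilter(family, base, subsets):
    """A filter is an ultrafilter iff it contains J or its complement for every J."""
    return is_filter(family, subsets) and all(
        (j in family) or (base - j in family) for j in subsets
    )


def ultrafilters(base):
    """Enumerate all ultrafilters over the finite set `base` by brute force."""
    base = frozenset(base)
    subsets = powerset(base)
    return [f for f in powerset(subsets) if is_ultrafilter(f, base, subsets)]


def principal(base, point):
    """The principal ultrafilter consisting of all subsets containing `point`."""
    return frozenset(s for s in powerset(base) if point in s)


if __name__ == "__main__":
    base = {0, 1, 2}
    found = ultrafilters(base)
    print(set(found) == {principal(base, p) for p in base})
```

The enumeration is exponential in the number of subsets, so it is only meant
for very small base sets.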

Definition 4.3. Let $I$ be a non-empty set, and let $\mathcal{U}$ be an
ultrafilter over $I$. The ultraproduct modulo $\mathcal{U}$ of a family
$\mathcal{G} = \{G_i\}_{i \in I}$ of groups is the quotient of the cartesian
product $\prod_{i \in I} G_i$ (i.e. the unrestricted direct product) by the
subgroup consisting of all tuples $(g_i)_{i \in I}$ such that the set

\[ \{ i \in I \mid g_i = 1 \} \]

lies in $\mathcal{U}$. We denote this ultraproduct by $\mathcal{G}_{\mathcal{U}}$.

Thus two tuples $(g_i)_{i \in I}$ and $(h_i)_{i \in I}$ of the cartesian product
define the same element of the ultraproduct $\mathcal{G}_{\mathcal{U}}$ if and
only if the set of indices $i$ for which $g_i = h_i$ lies in $\mathcal{U}$. In
the remainder of the paper, we use the bar notation for the image of an element
or a subset of $\prod_{i \in I} G_i$ in an ultraproduct.

The first proof of Theorem 4.1 that we present is based on the following
particular case of Łoś's Theorem from model theory. (See Theorem 3.1 and
Corollary 3.2 of [2].)

Lemma 4.4. Let $\mathcal{G} = \{G_i\}_{i \in I}$ be a family of groups and let
$\mathcal{U}$ be an ultrafilter over $I$. Then a sentence in the first-order
language of groups holds in the ultraproduct $\mathcal{G}_{\mathcal{U}}$ if and
only if the set of all $i \in I$ for which the sentence holds in $G_i$ is a
member of $\mathcal{U}$.

Recall that the width of a word $\omega$ in a group $G$ is the supremum, as $g$
ranges over the verbal subgroup $\omega(G)$, of the minimum length of all
decompositions of $g$ as a product of elements of $G_\omega \cup G_\omega^{-1}$.
Obviously, if $G$ is finite, then $\omega$ has finite width in $G$. We may
similarly speak of the width of $\omega$ over a subset $S$ of $\omega(G)$, by
taking the supremum only over elements of $S$.

First proof of Theorem 4.1. This proof is based on the following two facts:

(i) For a given positive integer $m$, the property that $\omega$ takes at most
$m$ values in a group can be expressed as a sentence in the first-order
language of groups. More precisely, if $\omega$ involves $k$ indeterminates, we
may use the following formula:

\[ \exists g_{11} \dots \exists g_{1k} \, \dots \, \exists g_{m1} \dots \exists g_{mk} \, \forall x_1 \dots \forall x_k \, \bigvee_{i=1}^{m} \omega(x_1, \dots, x_k) = \omega(g_{i1}, \dots, g_{ik}). \]

(ii) For a given positive integer $n$, the property that $\omega$ has width at
most $n$ in a group can be expressed as a sentence in the first-order language
of groups. To see this, note that this property is equivalent to every product
of $n+1$ elements of $G_\omega \cup G_\omega^{-1}$ being also a product of $n$
elements of that set.

Assume, by way of contradiction, that there is an infinite sequence
$\mathcal{G} = \{G_n\}_{n \in \mathbb{N}}$ of groups such that
$|(G_n)_\omega| \le m$ for all $n \in \mathbb{N}$ but $|\omega(G_n)|$ goes to
infinity. Choose a non-principal ultrafilter $\mathcal{U}$ over $\mathbb{N}$,
and let $Q = \mathcal{G}_{\mathcal{U}}$. Then, by Lemma 4.4 and (i), we have
$|Q_\omega| \le m$. It follows that $|\omega(Q)|$ is finite, since $\omega$ is
concise. Then $\omega$ has finite width, say $k$, in $Q$. By Lemma 4.4 again,
this time used together with (ii), there is a subset $J \in \mathcal{U}$ such
that $\omega$ has width at most $k$ in $G_n$ for all $n \in J$. Since
$|(G_n)_\omega| \le m$, it follows that $|\omega(G_n)| \le (2m)^k$ for every
$n \in J$. This is incompatible with the condition
$\lim_{n \to \infty} |\omega(G_n)| = \infty$: since $\mathcal{U}$ is a
non-principal ultrafilter, every cofinite subset of $\mathbb{N}$ has non-empty
intersection with $J$. □
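
The counting step, that a word with at most $m$ values and width at most $k$
has a verbal subgroup of order at most $(2m)^k$, can be checked by hand in a
small case. The Python sketch below is only an illustration under our own
choices: it takes the commutator word $[x, y]$ and the symmetric group $S_3$
(neither is singled out in the argument above), represents permutations as
tuples, and computes the set of values, the verbal subgroup and the width by a
breadth-first search. All helper names are hypothetical.

```python
from itertools import permutations, product


def compose(p, q):
    """Composition of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def commutator(a, b):
    """The commutator [a, b] = a^-1 b^-1 a b."""
    return compose(compose(inverse(a), inverse(b)), compose(a, b))


def word_values(group):
    """Set of values of the word [x, y] in the given finite group."""
    return {commutator(a, b) for a, b in product(group, repeat=2)}


def verbal_subgroup_and_width(values):
    """Return the subgroup generated by `values` and the width of the word in it."""
    identity = tuple(range(len(next(iter(values)))))
    gens = set(values) | {inverse(v) for v in values}
    seen = {identity}
    frontier = {identity}
    width = 0
    while True:
        # Elements whose minimum length is exactly width + 1.
        new = {compose(g, s) for g in frontier for s in gens} - seen
        if not new:
            return seen, width
        seen |= new
        frontier = new
        width += 1


if __name__ == "__main__":
    s3 = list(permutations(range(3)))
    values = word_values(s3)
    subgroup, width = verbal_subgroup_and_width(values)
    m = len(values)
    print(len(subgroup) <= (2 * m) ** width)
```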

Now we give a second proof of Theorem 4.1, which only needs the definition and
basic properties of ultraproducts, and which is independent of Łoś's Theorem.
This proof basically follows an argument communicated to us by Avinoam Mann.

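Lemma 4.5. Let $\mathcal{G} = \{G_n\}_{n \in \mathbb{N}}$ be a family of groups,
and for every $n \in \mathbb{N}$, let $S_n$ be a non-empty finite subset of
$G_n$. If $\mathcal{U}$ is an ultrafilter over $\mathbb{N}$, then the
cardinality of the image $\overline{S}$ of $S = \prod_{n \in \mathbb{N}} S_n$
in the ultraproduct $\mathcal{G}_{\mathcal{U}}$ is given by

\[ |\overline{S}| = \sup_{J \in \mathcal{U}} \Bigl( \min_{n \in J} |S_n| \Bigr), \tag{3} \]

provided that the supremum is finite, and $\overline{S}$ is infinite otherwise.
In particular:

(i) If $|S_n| \le k$ for all $n$, then $|\overline{S}| \le k$.

(ii) If the ultrafilter $\mathcal{U}$ is non-principal and $|S_n| \ge k$ for big
enough $n$, then $|\overline{S}| \ge k$.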

Proof. Let $J$ be an arbitrary element of $\mathcal{U}$, and put
$m = \min_{n \in J} |S_n|$. Let us prove that $|\overline{S}| \ge m$, which
gives one of the inequalities in (3). For every $n \in \mathbb{N}$, we consider
$m$ elements $s_n^{(1)}, \dots, s_n^{(m)} \in S_n$, which we take different if
$n \in J$ and arbitrary if $n \notin J$. Let

\[ s^{(i)} = (s_n^{(i)})_{n \in \mathbb{N}}, \quad \text{for every } i = 1, \dots, m. \]

We claim that the images of $s^{(i)}$ and $s^{(j)}$ in
$\mathcal{G}_{\mathcal{U}}$ are different for all $i \ne j$. Otherwise, the
tuples $s^{(i)}$ and $s^{(j)}$ coincide on a subset $X \in \mathcal{U}$, but
they are different by construction on $J \in \mathcal{U}$. Hence
$J \subseteq \mathbb{N} \setminus X$ and, by (ii) of the definition of a
filter, we also have $\mathbb{N} \setminus X \in \mathcal{U}$. Thus both $X$
and $\mathbb{N} \setminus X$ lie in $\mathcal{U}$, which is impossible since
$\mathcal{U}$ is an ultrafilter. This proves our claim, and consequently that
$|\overline{S}| \ge m$. Observe that this also proves that $\overline{S}$ is
infinite if the supremum in (3) is not finite.

For the reverse inequality, put
$r = \sup_{J \in \mathcal{U}} (\min_{n \in J} |S_n|)$, and assume that $r$ is
finite. By way of contradiction, suppose that $|\overline{S}| \ge r + 1$. If
$s^{(1)}, \dots, s^{(r+1)}$ are elements of $S$ whose images in
$\mathcal{G}_{\mathcal{U}}$ are all different, then for all
$i, j \in \{1, \dots, r+1\}$, $i \ne j$, the set

\[ X_{ij} = \{ n \in \mathbb{N} \mid s_n^{(i)} \ne s_n^{(j)} \} \]

belongs to $\mathcal{U}$. Hence the intersection $J$ of all the $X_{ij}$ is
also in $\mathcal{U}$. Now observe that, if $n \in J$, then
$s_n^{(1)}, \dots, s_n^{(r+1)}$ are all different and, consequently,
$|S_n| \ge r + 1$. It follows that $\min_{n \in J} |S_n| \ge r + 1$, which
contradicts the definition of $r$.

Finally, observe that (i) is obvious, and that (ii) follows because a non- 
principal ultrafilter contains all cofinite subsets. □ 

If $\omega$ is a word and $\{G_i\}_{i \in I}$ is an infinite family of groups,
it is not always the case that
$\omega(\prod_{i \in I} G_i) = \prod_{i \in I} \omega(G_i)$, and only the
inclusion $\subseteq$ may be guaranteed. Our next lemma is an approximation to
the reverse inclusion.

Lemma 4.6. Let $\omega$ be a word, and let $\{G_i\}_{i \in I}$ be a family of
groups. Suppose that $S_i \subseteq \omega(G_i)$ for every $i \in I$, and that
the width of $\omega$ can be uniformly bounded over all the subsets $S_i$. Then

\[ \prod_{i \in I} S_i \subseteq \omega\Bigl( \prod_{i \in I} G_i \Bigr). \]

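Proof. Let $g = (g_i)_{i \in I} \in \prod_{i \in I} S_i$. If the width of
$\omega$ is at most $k$ over all the subsets $S_i$, then every $g_i$ can be
written as a product of $k$ elements $x_i^{(1)}, \dots, x_i^{(k)}$ of
$(G_i)_\omega \cup (G_i)_\omega^{-1}$. We use these elements to define $2k$
elements of $G_i$ as follows: for every $j = 1, \dots, k$, we put

\[ g_i^{(2j-1)} = \begin{cases} x_i^{(j)}, & \text{if } x_i^{(j)} \in (G_i)_\omega, \\ 1, & \text{otherwise,} \end{cases} \qquad g_i^{(2j)} = \begin{cases} 1, & \text{if } x_i^{(j)} \in (G_i)_\omega, \\ x_i^{(j)}, & \text{otherwise.} \end{cases} \]

Then $g_i^{(2j-1)} \in (G_i)_\omega$, $g_i^{(2j)} \in (G_i)_\omega^{-1}$ and
$g_i = g_i^{(1)} \cdots g_i^{(2k)}$ for every $i \in I$. If we put
$g^{(r)} = (g_i^{(r)})_{i \in I}$ for $r = 1, \dots, 2k$, it follows that
$g^{(2j-1)} \in (\prod_{i \in I} G_i)_\omega$ and
$g^{(2j)} \in (\prod_{i \in I} G_i)_\omega^{-1}$ for $j = 1, \dots, k$, and also
that $g = g^{(1)} \cdots g^{(2k)}$. Thus $g \in \omega(\prod_{i \in I} G_i)$, as
desired. □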

Lemma 4.7. Let $\omega$ be a word, and let $G$ be a group such that
$|\omega(G)| \ge k$, where $k$ is a positive integer. Then there exists a
subset $S$ of $\omega(G)$ such that $|S| \ge k$ and $\omega$ has width less
than $k$ over $S$.

Proof. For every integer $i \ge 0$, let $T_i$ be the subset of all elements of
$\omega(G)$ of (minimum) length $i$ with respect to the set of generators
$G_\omega \cup G_\omega^{-1}$. Put $T = \bigcup_{i=0}^{k-1} T_i$. If $T_i$ is
non-empty for every $i = 0, \dots, k-1$, then $|T| \ge k$ and we may take
$S = T$. If, on the contrary, $T_i$ is empty for some $i = 0, \dots, k-1$,
then $\omega$ has width at most $i - 1$ in $G$, and then we may take
$S = \omega(G)$. □
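
The layers $T_i$ in this proof can be computed explicitly in a small finite
group. The Python sketch below is an illustration under assumptions of our
own: it uses the power word $x^3$ in the symmetric group $S_4$ (chosen only
because it has more than one non-trivial layer; it is not an outer commutator
word), and the helper names are hypothetical. The subset $S$ is taken as the
union of the layers $T_0, \dots, T_{k-1}$, which is the whole verbal subgroup
as soon as one of these layers is empty.

```python
from itertools import permutations


def compose(p, q):
    """Composition of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def power(p, n):
    """The n-th power of the permutation p, for n >= 0."""
    result = tuple(range(len(p)))
    for _ in range(n):
        result = compose(result, p)
    return result


def length_layers(values):
    """layers[i] is the set T_i of elements of minimum length i over values and inverses."""
    identity = tuple(range(len(next(iter(values)))))
    gens = set(values) | {inverse(v) for v in values}
    layers = [{identity}]
    seen = {identity}
    while True:
        new = {compose(g, s) for g in layers[-1] for s in gens} - seen
        if not new:
            return layers
        layers.append(new)
        seen |= new


def choose_subset(layers, k):
    """Union of T_0, ..., T_{k-1}; equals the verbal subgroup if some T_i is empty."""
    return set().union(*layers[:k])


if __name__ == "__main__":
    s4 = list(permutations(range(4)))
    cubes = {power(g, 3) for g in s4}
    layers = length_layers(cubes)
    print([len(t) for t in layers])
    print(len(choose_subset(layers, 2)))
```

By construction, every element of the chosen subset has length at most
$k - 1$, so the width over it is less than $k$.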

Second proof of Theorem 4.1. By way of contradiction, assume that there is a
family $\mathcal{G} = \{G_n\}_{n \in \mathbb{N}}$ of groups such that
$|(G_n)_\omega| \le m$ for all $n$, but nevertheless
$\lim_{n \to \infty} |\omega(G_n)| = \infty$. Let us fix an arbitrary positive
integer $k$. According to Lemma 4.7, if $n$ is big enough, there is a subset
$S_n$ of $\omega(G_n)$ such that $|S_n| \ge k$ and $\omega$ has width less than
$k$ over $S_n$. We complete the sequence $\{S_n\}_{n \in \mathbb{N}}$ by
choosing the first terms equal to 1. Now, if $G = \prod_{n \in \mathbb{N}} G_n$
and $S = \prod_{n \in \mathbb{N}} S_n$, we have

\[ G_\omega = \prod_{n \in \mathbb{N}} (G_n)_\omega, \quad \text{and} \quad S \subseteq \omega(G), \]

where the last inclusion follows from Lemma 4.6. Consider now a non-principal
ultrafilter $\mathcal{U}$ over $\mathbb{N}$, and let
$Q = \mathcal{G}_{\mathcal{U}}$ be the corresponding ultraproduct. Then
$Q_\omega = \overline{G_\omega}$ and
$\omega(Q) = \overline{\omega(G)} \supseteq \overline{S}$. By applying Lemma
4.5, we obtain that $|Q_\omega| \le m$ and $|\omega(Q)| \ge k$. Since $k$ is
arbitrary, we get $|\omega(Q)| = \infty$, which is a contradiction, since the
word $\omega$ is concise. □